Paraphrase and Textual Entailment Recognition and Generation

Author

  • Prodromos Malakasiotis
Abstract

Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment, and methods from the two areas are often very similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. In this thesis, we focus on paraphrase and textual entailment recognition, as well as paraphrase generation. We propose three paraphrase and textual entailment recognition methods and evaluate them experimentally on existing benchmarks. The key idea is that by capturing similarities at various abstractions of the inputs, we can recognize paraphrases and textual entailment reasonably well. Additionally, we exploit WordNet and use features that operate on the syntactic level of the language expressions. The best of our three recognition methods achieves state-of-the-art results on the widely used MSR paraphrasing corpus, but the simplest of our methods is also a very competitive baseline. On textual entailment datasets, our methods achieve worse results. Nevertheless, they perform reasonably well despite being simpler than several other proposed methods; therefore, they can be considered competitive baselines for future work.
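The idea of capturing similarities at various abstractions of the inputs can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the two "views" (surface tokens and crudely stemmed tokens), the two measures (Jaccard overlap and a `difflib` sequence ratio), and all function names are assumptions chosen for illustration, and `crude_stem` is a hypothetical stand-in for a real stemmer.

```python
from difflib import SequenceMatcher

PUNCT = ".,;:!?\"'()"


def tokens(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in stripped if w]


def crude_stem(word):
    """Hypothetical stand-in for a real stemmer: drop a few common suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def jaccard(a, b):
    """Set overlap of two token lists (1.0 when both are empty)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def sequence_ratio(a, b):
    """Order-sensitive similarity of two token lists via difflib."""
    return SequenceMatcher(None, a, b).ratio()


def similarity_features(s1, s2):
    """Compute the same measures on each view (abstraction) of the pair."""
    t1, t2 = tokens(s1), tokens(s2)
    views = {
        "tokens": (t1, t2),
        "stems": ([crude_stem(w) for w in t1], [crude_stem(w) for w in t2]),
    }
    features = {}
    for name, (a, b) in views.items():
        features[f"{name}_jaccard"] = jaccard(a, b)
        features[f"{name}_sequence"] = sequence_ratio(a, b)
    return features


if __name__ == "__main__":
    for key, value in similarity_features("He walked home.", "He walks home.").items():
        print(f"{key}: {value:.3f}")
```

In this sketch the more abstract view (stems) scores the pair as more similar than the surface view does, which is the kind of signal such features are meant to expose. In a recognition setting, features like these might be passed to a learned classifier; that step is omitted here.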

Similar resources

Paraphrase and Textual Entailment Generation in Czech

Paraphrase and textual entailment generation can support natural language processing (NLP) tasks that simulate text understanding, e.g., text summarization, plagiarism detection, or question answering. A paraphrase, i.e., a sentence with the same meaning, conveys a certain piece of information with new words and new syntactic structures. Textual entailment, i.e., an inference that humans will j...

Recognizing Paraphrases And Textual Entailment Using Inversion Transduction Grammars

We present first results using paraphrase as well as textual entailment data to test the language universal constraint posited by Wu’s (1995, 1997) Inversion Transduction Grammar (ITG) hypothesis. In machine translation and alignment, the ITG Hypothesis provides a strong inductive bias, and has been shown empirically across numerous language pairs and corpora to yield both efficiency and accura...

TextFlow: A Text Similarity Measure based on Continuous Sequences

Text similarity measures are used in multiple tasks such as plagiarism detection, information ranking and recognition of paraphrases and textual entailment. While recent advances in deep learning highlighted further the relevance of sequential models in natural language generation, existing similarity measures do not fully exploit the sequential nature of language. Examples of such similarity m...

Multi-word expressions in textual inference: Much ado about nothing?

Multi-word expressions (MWE) have seen much attention from the NLP community. In this paper, we investigate their impact on the recognition of textual entailment (RTE). Using the manual Microsoft Research annotations, we first manually count and classify MWEs in RTE data. We find few, most of which are arguably unlikely to cause processing problems. We then consider the impact of MWEs on a curr...

Paraphrase and Textual Entailment Generation

A given piece of information can be conveyed by many different sentences. This variety concerns the choice of vocabulary and style as well as the level of detail (from laconism or succinctness to total verbosity). Although verbosity in written texts is considered bad style, generated verbosity can help natural language processing (NLP) systems to fill in the implicit knowledge. The paper presents...

Publication date: 2011